A Paradigm for Multilingual Information Retrieval System for Indian Languages
Abstract
The paper presents a paradigm for a multilingual information system for Indian languages. MARC21 is taken as the basis, and Unicode is adopted for the multilingual representation of bibliographic data in Indian languages. The objective of the system is to present search results in any one chosen Indian language, irrespective of the original language of the bibliographic record. In other words, the retrieval engine should translate or transliterate the result set on the fly. To demonstrate the system, protocols and tools such as Z39.50, Zebra Server, and MARCXML are used, and the programs are written in PHP to make the system web-based.
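The on-the-fly transliteration step described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation (which is written in PHP and is not shown on this page). It assumes a simple technique: the major Indic script blocks in Unicode share a parallel layout inherited from ISCII, so a character can often be moved between scripts by shifting its code point. The helper names `transliterate` and `title_of` and the sample record are hypothetical; the namespace is the standard MARCXML one.

```python
import xml.etree.ElementTree as ET

# Standard MARCXML namespace.
MARC_NS = {"marc": "http://www.loc.gov/MARC21/slim"}

# First code point of each script's Unicode block (each block spans 0x80).
SCRIPT_BASE = {
    "devanagari": 0x0900,
    "bengali": 0x0980,
    "gurmukhi": 0x0A00,
    "gujarati": 0x0A80,
    "oriya": 0x0B00,
    "tamil": 0x0B80,
    "telugu": 0x0C00,
    "kannada": 0x0C80,
    "malayalam": 0x0D00,
}
BLOCK_SIZE = 0x80


def transliterate(text, source, target):
    """Shift characters of the source script block into the target block.

    Hypothetical helper: a naive offset mapping, with no handling of
    letters that the target script lacks.
    """
    src, dst = SCRIPT_BASE[source], SCRIPT_BASE[target]
    out = []
    for ch in text:
        cp = ord(ch)
        if src <= cp < src + BLOCK_SIZE:
            out.append(chr(cp - src + dst))
        else:
            out.append(ch)  # leave Latin text, spaces, punctuation untouched
    return "".join(out)


def title_of(marcxml):
    """Return the title proper (field 245, subfield a) of a MARCXML record."""
    root = ET.fromstring(marcxml)
    node = root.find(
        ".//marc:datafield[@tag='245']/marc:subfield[@code='a']", MARC_NS
    )
    return node.text if node is not None else None


# Made-up sample record whose title is in Devanagari.
SAMPLE = """<record xmlns="http://www.loc.gov/MARC21/slim">
  <datafield tag="245" ind1="0" ind2="0">
    <subfield code="a">भारत</subfield>
  </datafield>
</record>"""

if __name__ == "__main__":
    title = title_of(SAMPLE)
    print(transliterate(title, "devanagari", "bengali"))
```

The offset mapping is only a first approximation: some positions are unassigned in certain blocks (Tamil, for example, has no aspirated consonants), and shared punctuation that lives in the Devanagari block would also be shifted, so a real system would need per-script exception tables.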
Similar resources
Multilingual Speech Recognition for Information Retrieval in Indian Context
This paper analyzes various issues in building an HMM-based multilingual speech recognizer for Indian languages. The system was originally designed for the Hindi and Tamil languages and adapted to incorporate Indian-accented English. Language-specific characteristics in the speech recognition framework are highlighted. The recognizer is embedded in information retrieval applications and hence several iss...
IndoWordNet Dictionary: An Online Multilingual Dictionary using IndoWordNet
India is a country with a diverse culture, many languages, and a varied heritage, and it is therefore very rich in languages and their dialects. For a multilingual society, a multilingual dictionary is a necessity and one of the major resources for supporting a language. There are dictionaries for many Indian languages, but very few are available in multiple languages. WordNet is one of the most prominent ...
Domain Specific Information Retrieval in Multilingual Environment
In today’s globalized world, storage and retrieval in local languages is essential for developing nations like India. Because our country is linguistically diverse and only 10% of the population knows English, this diversity of languages is becoming a barrier to understanding and participating in the digital world. It has been found that when services are provided in local languages, it has b...
A Review on the Cross and Multilingual Information Retrieval
In this paper we explore some of the most important areas of information retrieval, in particular Crosslingual Information Retrieval (CLIR) and Multilingual Information Retrieval (MLIR). CLIR deals with asking questions in one language and retrieving documents in a different language. MLIR deals with asking questions in one or more languages and retrieving documents in one or more different lang...
A Language-Independent Approach to Identify the Named Entities in Under-Resourced Languages and Clustering Multilingual Documents
This paper presents a language-independent Multilingual Document Clustering (MDC) approach on comparable corpora. Named entities (NEs) such as persons, locations, and organizations play a major role in measuring document similarity. We propose a method to identify the NEs present in under-resourced Indian languages (Hindi and Marathi) using the NEs present in English, which is a high resourced...
Publication date: 2004